Supplementary Materials: Accelerated Stochastic Gradient Descent for Minimizing Finite Sums

Author

  • Atsushi Nitanda
Abstract

1 Proof of Proposition 1

We now prove Proposition 1, which gives a condition for the compactness of the sublevel set.

Proof. Let B(r) and S(r) denote the ball and the sphere of radius r, centered at the origin. By an affine transformation, we can assume that X∗ contains the origin O, that X∗ ⊂ B(1), and that X∗ ∩ S(1) = ∅. Then, for all x ∈ S(1),

(∇f(x), x) ≥ f(x) − f(O) > 0,

where we use convexity for the first inequality and O ∈ X∗ and x ∉ X∗ for the second inequality. We denote the minimum value of (∇f(x), x) on S(1) by α. Since (∇f(x), x) is positive and continuous on the compact set S(1), we have α > 0. For all r ≥ 1 and x ∈ S(r), we set x̂ = x/r ∈ S(1). Since x − x̂ = (r − 1)x̂, it follows that

f(x) ≥ f(x̂) + (∇f(x̂), x − x̂) ≥ f(x̂) + (r − 1)(∇f(x̂), x̂) ≥ f∗ + (r − 1)α.

This inequality implies that if r > 1 + (c − f∗)/α, then f(x) > c for all x ∈ S(r). Since this holds for every such r, the sublevel set {x ∈ ℝ^d ; f(x) ≤ c} is contained in the ball of radius 1 + (c − f∗)/α, and it is closed by the continuity of f. Therefore, it is a closed bounded set.

2 Proof of Lemma 1

To prove Lemma 1, the following lemma is required, which is also shown in [1].

Lemma A. Let {ξ_i}_{i=1}^n be a set of vectors in ℝ^d and let μ denote the average of {ξ_i}_{i=1}^n. Let I denote a uniform random variable representing a size-b subset of {1, 2, . . . , n}. Then, it follows that

E_I ‖ (1/b) Σ_{i∈I} ξ_i − μ ‖² = (n − b)/(b(n − 1)) · (1/n) Σ_{i=1}^n ‖ ξ_i − μ ‖².
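As a quick numerical illustration (not part of the original supplement), the following Python sketch checks the sampling-without-replacement identity stated in Lemma A by Monte Carlo. The sizes n, d, b, the Gaussian vectors, and the trial count are arbitrary choices made for this example.

```python
import numpy as np

# Monte Carlo check of the identity in Lemma A:
#   E_I || (1/b) * sum_{i in I} xi_i - mu ||^2
#     = (n - b) / (b * (n - 1)) * (1/n) * sum_{i=1}^n || xi_i - mu ||^2
# where I is a uniformly random size-b subset of {1, ..., n}.
# All problem sizes below are arbitrary illustrative choices.

rng = np.random.default_rng(0)
n, d, b = 50, 10, 8              # number of vectors, dimension, mini-batch size
xi = rng.normal(size=(n, d))     # arbitrary vectors {xi_i}
mu = xi.mean(axis=0)             # their average

# Closed-form right-hand side: scaled average squared deviation from the mean.
rhs = (n - b) / (b * (n - 1)) * np.mean(np.sum((xi - mu) ** 2, axis=1))

# Left-hand side: expectation over random size-b subsets, estimated by drawing
# subsets uniformly without replacement.
trials = 100_000
acc = 0.0
for _ in range(trials):
    I = rng.choice(n, size=b, replace=False)
    acc += np.sum((xi[I].mean(axis=0) - mu) ** 2)
lhs = acc / trials

print(f"Monte Carlo LHS ≈ {lhs:.6f}")
print(f"closed-form RHS = {rhs:.6f}")
assert abs(lhs - rhs) / rhs < 0.02   # agreement up to Monte Carlo error
```

The two printed values agree up to Monte Carlo error, which is what the final assertion checks.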


Similar Articles

Accelerated Stochastic Gradient Descent for Minimizing Finite Sums

We propose an optimization method for minimizing the finite sums of smooth convex functions. Our method incorporates an accelerated gradient descent (AGD) and a stochastic variance reduction gradient (SVRG) in a mini-batch setting. Unlike SVRG, our method can be directly applied to non-strongly and strongly convex problems. We show that our method achieves a lower overall complexity than the re...


Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite Sum Structure

Stochastic optimization algorithms with variance reduction have proven successful for minimizing large finite sums of functions. However, in the context of empirical risk minimization, it is often helpful to augment the training set by considering random perturbations of input examples. In this case, the objective is no longer a finite sum, and the main candidate for optimization is the stochas...


A Simple Practical Accelerated Method for Finite Sums

We describe a novel optimization method for finite sums (such as empirical risk minimization problems) building on the recently introduced SAGA method. Our method achieves an accelerated convergence rate on strongly convex smooth problems. Our method has only one parameter (a step size), and is radically simpler than other accelerated methods for finite sums. Additionally it can be applied when...


Stochastic Smoothing for Nonsmooth Minimizations: Accelerating SGD by Exploiting Structure

In this work we consider the stochastic minimization of nonsmooth convex loss functions, a central problem in machine learning. We propose a novel algorithm called Accelerated Nonsmooth Stochastic Gradient Descent (ANSGD), which exploits the structure of common nonsmooth loss functions to achieve optimal convergence rates for a class of problems including SVMs. It is the first stochastic algori...


Conditional Accelerated Lazy Stochastic Gradient Descent

In this work we introduce a conditional accelerated lazy stochastic gradient descent algorithm with an optimal number of calls to a stochastic first-order oracle and convergence rate O(1/ε²), improving over the projection-free, Online Frank-Wolfe based stochastic gradient descent of Hazan and Kale [2012], which has convergence rate O(1/ε⁴).




Publication date: 2016